CCS Resource Management in Networked HPC Systems
Authors
Abstract
CCS is a resource management system for parallel high-performance computers. At the user level, CCS provides vendor-independent access to parallel systems. At the system administrator level, CCS offers tools for controlling (i.e., specifying, configuring, and scheduling) the system components that are operated in a computing center, hence the name "Computing Center Software". CCS provides hardware-independent scheduling of interactive and batch jobs, partitioning of exclusive and non-exclusive resources, open, extensible interfaces to other resource management systems, a high degree of reliability (e.g., automatic restart of crashed daemons), and fault tolerance in the case of network breakdowns. In this paper, we describe CCS as one important component for the access, job distribution, and administration of networked HPC systems in a metacomputing environment.
Similar resources
A Comparison of Job Management Systems in Supporting HPC ClusterTools
This paper compares the three most common job management systems and how they work with Sun HPC ClusterTools 3.1. Various aspects such as installation, customization, scheduling, and resource control are discussed. The three chosen systems are Load Sharing Facility (LSF), Portable Batch System (PBS), and COmputing in DIstributed Networked Environment (CODINE)/Global Resource Director (GRD)....
Anatomy of a Resource Management System for HPC Clusters
Workstation clusters are often not only used for high-throughput computing in time-sharing mode but also for running complex parallel jobs in space-sharing mode. This poses several difficulties to the resource management system, which must be able to reserve computing resources for exclusive use and also to determine an optimal process mapping for a given system topology. On the basis of our CC...
Virtual Resource Management Based Meteorological Computational Grid
Meteorology is one of the main application areas of high performance computing (HPC) technologies; efficient and accurate weather forecasting and meteorological services are impossible without the support of HPC resources. As a kind of networked HPC application environment, the meteorological computational Grid integrates diverse computing resources using Grid technology, pro...
Managing clusters of geographically distributed high-performance computers
We present a software system for the management of geographically distributed high-performance computers. It consists of three components: 1. The Computing Center Software (CCS) is vendor-independent resource management software for local HPC systems. It controls the mapping and scheduling of interactive and batch jobs on massively parallel systems; 2. The Resource and Service Description (RSD...
Scheduling in HPC Resource Management Systems: Queuing vs. Planning
Nearly all existing HPC systems are operated by resource management systems based on the queuing approach. With the increasing acceptance of grid middleware like Globus, new requirements for the underlying local resource management systems arise. Features like advance reservation or quality of service are needed to implement high-level functions like co-allocation. However, it is difficult to r...
Journal title:
Volume, issue:
Pages: -
Publication date: 1998